Cheating to achieve Formal Concept Analysis over a large formal context
Internal identifier: 002052 (Main/Exploration); previous: 002051; next: 002053
Authors: Victor Codocedo [Chile]; Carla Taramasco [France]; Hernan Astudillo [Chile]
Abstract
Researchers face one of the main problems of the Information Era: as more articles become available electronically, it gets harder to follow trends across research domains. Knowledge models of research domains that are cheap, coherent and fast to construct will be much needed once the volume of information becomes unmanageable. Formal Concept Analysis (FCA) has been widely used in several areas to construct knowledge artifacts for this purpose (ontology development, information retrieval, software refactoring, knowledge discovery), but the large number of documents and the extensive terminology of research domains make it a poor fit, because of its high computational cost and an output too large for a human to process. In this article we propose a novel heuristic to create a taxonomy from a large term-document dataset using Latent Semantic Analysis and Formal Concept Analysis. We present and discuss its implementation on a real dataset from the Software Architecture community, obtained from the ISI Web of Knowledge (4400 documents).
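To make the cost argument in the abstract concrete, the following is a minimal sketch of plain Formal Concept Analysis on a toy term-document context. It is not the heuristic proposed by the authors and contains no Latent Semantic Analysis step; the document names, terms and function names are all hypothetical. It enumerates formal concepts naively by closing every subset of documents, which is exponential in the number of documents and hints at why FCA alone is impractical on thousands of documents.

```python
from itertools import combinations

# Toy formal context (hypothetical data): document -> set of terms it contains.
context = {
    "d1": {"architecture", "pattern"},
    "d2": {"architecture", "pattern", "quality"},
    "d3": {"architecture", "quality"},
}


def common_terms(docs, context):
    """Terms shared by every document in docs (all terms if docs is empty)."""
    result = set().union(*context.values())
    for d in docs:
        result &= context[d]
    return result


def docs_having(terms, context):
    """Documents that contain every term in terms."""
    return {d for d, ts in context.items() if terms <= ts}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of documents.

    Naive approach: 2^n subsets for n documents, so only usable on tiny contexts.
    """
    concepts = set()
    docs = sorted(context)
    for r in range(len(docs) + 1):
        for subset in combinations(docs, r):
            intent = common_terms(subset, context)
            extent = docs_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

On this three-document context the sketch finds four concepts. With 4400 documents the same subset enumeration would be infeasible, which is the kind of blow-up that motivates reducing the context before applying FCA.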
Links toward previous steps (curation, corpus...)
- to stream Hal, to step Corpus: 001459
- to stream Hal, to step Curation: 001459
- to stream Hal, to step Checkpoint: 001B03
- to stream Main, to step Merge: 002095
- to stream Main, to step Curation: 002052
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Cheating to achieve Formal Concept Analysis over a large formal context</title>
<author><name sortKey="Codocedo, Victor" sort="Codocedo, Victor" uniqKey="Codocedo V" first="Victor" last="Codocedo">Victor Codocedo</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-36850" status="VALID"><orgName>Departamento de Informatica [Valparaíso, Chile]</orgName>
<desc><address><addrLine>Av.España 1680 - Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://portal.inf.utfsm.cl</ref>
</desc>
<listRelation><relation active="#struct-406898" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-406898" type="direct"><org type="institution" xml:id="struct-406898" status="VALID"><orgName>Universidad Tecnica Federico Santa Maria [Valparaiso]</orgName>
<orgName type="acronym">UTFSM</orgName>
<desc><address><addrLine>Avenida España 1680, Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://www.usm.cl/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Chili</country>
</affiliation>
</author>
<author><name sortKey="Taramasco, Carla" sort="Taramasco, Carla" uniqKey="Taramasco C" first="Carla" last="Taramasco">Carla Taramasco</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-1173" status="OLD"><orgName>Centre de recherche en épistémologie appliquée</orgName>
<orgName type="acronym">CREA</orgName>
<desc><address><addrLine>ROUTE DE SACLAY 91128 PALAISEAU CEDEX</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.crea.polytechnique.fr/LeCREA/</ref>
</desc>
<listRelation><relation name="UMR7656" active="#struct-441569" type="direct"></relation>
<relation active="#struct-300340" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="UMR7656" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300340" type="direct"><org type="institution" xml:id="struct-300340" status="VALID"><orgName>Polytechnique - X</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Astudillo, Hernan" sort="Astudillo, Hernan" uniqKey="Astudillo H" first="Hernan" last="Astudillo">Hernan Astudillo</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-36850" status="VALID"><orgName>Departamento de Informatica [Valparaíso, Chile]</orgName>
<desc><address><addrLine>Av.España 1680 - Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://portal.inf.utfsm.cl</ref>
</desc>
<listRelation><relation active="#struct-406898" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-406898" type="direct"><org type="institution" xml:id="struct-406898" status="VALID"><orgName>Universidad Tecnica Federico Santa Maria [Valparaiso]</orgName>
<orgName type="acronym">UTFSM</orgName>
<desc><address><addrLine>Avenida España 1680, Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://www.usm.cl/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Chili</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-00654576</idno>
<idno type="halId">hal-00654576</idno>
<idno type="halUri">https://hal.archives-ouvertes.fr/hal-00654576</idno>
<idno type="url">https://hal.archives-ouvertes.fr/hal-00654576</idno>
<date when="2011-10-17">2011-10-17</date>
<idno type="wicri:Area/Hal/Corpus">001459</idno>
<idno type="wicri:Area/Hal/Curation">001459</idno>
<idno type="wicri:Area/Hal/Checkpoint">001B03</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">001B03</idno>
<idno type="wicri:Area/Main/Merge">002095</idno>
<idno type="wicri:Area/Main/Curation">002052</idno>
<idno type="wicri:Area/Main/Exploration">002052</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Cheating to achieve Formal Concept Analysis over a large formal context</title>
<author><name sortKey="Codocedo, Victor" sort="Codocedo, Victor" uniqKey="Codocedo V" first="Victor" last="Codocedo">Victor Codocedo</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-36850" status="VALID"><orgName>Departamento de Informatica [Valparaíso, Chile]</orgName>
<desc><address><addrLine>Av.España 1680 - Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://portal.inf.utfsm.cl</ref>
</desc>
<listRelation><relation active="#struct-406898" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-406898" type="direct"><org type="institution" xml:id="struct-406898" status="VALID"><orgName>Universidad Tecnica Federico Santa Maria [Valparaiso]</orgName>
<orgName type="acronym">UTFSM</orgName>
<desc><address><addrLine>Avenida España 1680, Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://www.usm.cl/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Chili</country>
</affiliation>
</author>
<author><name sortKey="Taramasco, Carla" sort="Taramasco, Carla" uniqKey="Taramasco C" first="Carla" last="Taramasco">Carla Taramasco</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-1173" status="OLD"><orgName>Centre de recherche en épistémologie appliquée</orgName>
<orgName type="acronym">CREA</orgName>
<desc><address><addrLine>ROUTE DE SACLAY 91128 PALAISEAU CEDEX</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.crea.polytechnique.fr/LeCREA/</ref>
</desc>
<listRelation><relation name="UMR7656" active="#struct-441569" type="direct"></relation>
<relation active="#struct-300340" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="UMR7656" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300340" type="direct"><org type="institution" xml:id="struct-300340" status="VALID"><orgName>Polytechnique - X</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Astudillo, Hernan" sort="Astudillo, Hernan" uniqKey="Astudillo H" first="Hernan" last="Astudillo">Hernan Astudillo</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-36850" status="VALID"><orgName>Departamento de Informatica [Valparaíso, Chile]</orgName>
<desc><address><addrLine>Av.España 1680 - Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://portal.inf.utfsm.cl</ref>
</desc>
<listRelation><relation active="#struct-406898" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-406898" type="direct"><org type="institution" xml:id="struct-406898" status="VALID"><orgName>Universidad Tecnica Federico Santa Maria [Valparaiso]</orgName>
<orgName type="acronym">UTFSM</orgName>
<desc><address><addrLine>Avenida España 1680, Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://www.usm.cl/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Chili</country>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Researchers are facing one of the main problems of the Information Era. As more articles are made electronically available, it gets harder to follow trends in the different domains of research. Cheap, coherent and fast to construct knowledge models of research domains will be much required when information becomes unmanageable. While Formal Concept Analysis (FCA) has been widely used on several areas to construct knowledge artifacts for this purpose (Ontology development, Information Retrieval, Software Refactoring, Knowledge Discovery), the large amount of documents and terminology used on research domains makes it not a very good option (because of the high computational cost and humanly-unprocessable output). In this article we propose a novel heuristic to create a taxonomy from a large term-document dataset using Latent Semantic Analysis and Formal Concept Analysis. We provide and discuss its implementation on a real dataset from the Software Architecture community obtained from the ISI Web of Knowledge (4400 documents).</div>
</front>
</TEI>
<affiliations><list><country><li>Chili</li>
<li>France</li>
</country>
</list>
<tree><country name="Chili"><noRegion><name sortKey="Codocedo, Victor" sort="Codocedo, Victor" uniqKey="Codocedo V" first="Victor" last="Codocedo">Victor Codocedo</name>
</noRegion>
<name sortKey="Astudillo, Hernan" sort="Astudillo, Hernan" uniqKey="Astudillo H" first="Hernan" last="Astudillo">Hernan Astudillo</name>
</country>
<country name="France"><noRegion><name sortKey="Taramasco, Carla" sort="Taramasco, Carla" uniqKey="Taramasco C" first="Carla" last="Taramasco">Carla Taramasco</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002052 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002052 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Lorraine |area= InforLorV4 |flux= Main |étape= Exploration |type= RBID |clé= Hal:hal-00654576 |texte= Cheating to achieve Formal Concept Analysis over a large formal context }}
This area was generated with Dilib version V0.6.33.